GLYPHLETS

BACKGROUND OF INVENTION 
This invention relates to providing glyphs in a text document.
A character is the smallest component of written language having semantic value.
A character refers to an abstract meaning and/or shape, rather than a specific shape. A
glyph is a representation of a character. A glyph image is the actual concrete image of a
glyph representation having been rasterized or otherwise imaged onto some display
surface.

An encoded character is a character that is associated with an encoding value, for
example, a scalar value included in a character set standard such as ASCII (American
Standard Code for Information Interchange) or Unicode. An encoding value maps to a set
of character attributes defining semantic information of the character. Character set
standards are defined by standards organizations: for example, the ASCII standard is
defined by ANSI, and the ISO Standard 8859 is defined by ISO (International Standards
Organization). Character set standards are generally revised from time to time. Typically,
when a character set standard is defined, the encoding values are simultaneously defined.

Character attributes can include one or more of the following: character case,
character combining class, character directionality, character numeric value, mathematical
character, character language, letter character, alphabetic character and ideographic
character. Other character attributes are possible.

A glyph can be associated with a set of glyph attributes defining appearance
information for a representation of the corresponding character. Glyph attributes can
include one or more of the following: glyph shape, glyph metrics, typeface name, glyph
baseline and glyph kerning. Generally, glyph attributes provide the information necessary
to render the glyph image.

A font is a collection of glyphs and a corresponding encoding mapping. A font is
typically constructed to support a character set standard. That is, fonts include glyphs
representing characters included in the character set standard. When the character set
standard is revised, the font manufacturer may need to revise the font to accommodate the
changes, including the addition of new glyphs. In that case, a new font is re-issued
conforming to the new character set standard. Revising fonts is costly for the designer
and inconvenient for users, who must track versions of the font and determine whether or
not they have fonts supporting the latest character set standard.

Text documents typically include a text string that includes one or more encoding
values that represent characters in the text. An encoding value can map to a character in a
character set standard and to a glyph in a font constructed to support the character set
standard. Thus, a text engine (e.g., a word processing application) processing an
electronic document that includes a text string of encoding values can obtain character
attribute information about an encoded character represented by the encoding value by
mapping the encoding value to the character set standard. The text engine can also render
a representation of the character, that is, a glyph image, based on glyph attribute
information obtained from a specified font using the same encoding value. The encoding
value-attribute associations are typically available for a text engine to reference by
looking them up in fixed and static tables, indexed by encoding value. The attributes are
not part of the document itself but are usually built into the text engine or the operating
system used by the application.
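
As an illustration of the fixed, static lookup described above, the following
sketch reads character attributes for an encoding value from the tables that ship
with a language runtime. It uses Python's standard unicodedata module purely as a
convenient example of such built-in tables; the function name
character_attributes and the dictionary keys are assumptions made for this sketch
and do not come from the specification.

```python
import unicodedata


def character_attributes(encoding_value):
    """Look up character attributes in static tables indexed by encoding value."""
    ch = chr(encoding_value)
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),            # e.g. "Lu", "Nd", "Sc"
        "combining_class": unicodedata.combining(ch),    # 0 for non-combining marks
        "directionality": unicodedata.bidirectional(ch), # e.g. "L", "EN"
        "numeric_value": unicodedata.numeric(ch, None),  # None if not numeric
        "is_alphabetic": ch.isalpha(),
    }


digit_five = character_attributes(0x35)
euro_sign = character_attributes(0x20AC)
```

The point of the sketch is that the attributes live in the runtime, not in the
document: a value with no entry in these tables yields no usable attributes.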

A character can be processed based on its character attributes and/or glyph
attributes. For example, a layout engine that is setting text in vertical writing mode might
handle numerals in a specialized way, or might handle a currency symbol differently than
numerals in some contexts. As another example, attributes can be critical for input
methods, as the user may need to choose the character based on the corresponding glyph's
radical, stroke-count or pronunciation (e.g., a software agent used to assist selecting
Chinese/Japanese characters). Thus, for a representation of a character (i.e., a glyph) to
participate fully in an electronic document, the character and glyph attributes of the
character and corresponding glyph must be accessible by a text engine processing the
electronic document.



SUMMARY

The present invention provides methods and apparatus, including computer
program products, for processing and constructing an electronic text document. In
general, in one aspect, the invention features providing an electronic document including
a string that includes one or more references and parsing the string to identify a reference.
Based on the identified reference, a glyphlet is identified including a set of character
attributes defining semantic information of a character and a set of glyph attributes
defining appearance information for a representation of the character. One or more
character attributes or glyph attributes of the identified glyphlet are used to process the
string in the electronic document.

Implementations of the invention can include one or more of the following. The
string can include a plurality of references comprising one or more in-band values defined
in an encoding standard. Parsing the string to identify a reference can include interpreting
a plurality of in-band values to define the identified reference. Interpreting a plurality of
in-band values to define the identified reference can include identifying one or more
target attributes, and identifying a glyphlet based on the identified reference can include
identifying a glyphlet in a collection of glyphlets based on the identified target attributes.
The collection of glyphlets can be embedded within the electronic document or can be
external to the electronic document.

Alternatively, the plurality of in-band values can define the identified glyphlet. In
another alternative, the plurality of in-band values can identify a location external to the
electronic document from which the identified glyphlet can be retrieved.

In another implementation, the string can include one or more references
comprising one or more out-of-band values not defined in an encoding standard. The
identified reference can include one or more of the out-of-band values. If the identified
glyphlet is embedded within the electronic document, the one or more out-of-band values
can be directly associated with the identified glyphlet. Identifying a glyphlet based on the
identified reference can include identifying one or more target attributes based on the
identified reference and identifying a glyphlet in a collection of glyphlets based on the
identified target attributes. The collection of glyphlets can be embedded within the
electronic document or external to the electronic document. The one or more out-of-band
values can identify a location external to the electronic document from which the
identified glyphlet can be retrieved.

The set of character attributes can include one or more character attributes
selected from the group consisting of character case, character category, character
combining class, character directionality, character numeric value, mathematical
character, character language, letter character, alphabetic character and ideographic
character. The set of glyph attributes can include one or more glyph attributes selected
from the group consisting of glyph shape, typographic weight, typographic width, slant,
number of strokes, glyph metrics, typeface name, glyph baseline and glyph kerning.

The identified glyphlet can be retrieved from a memory external to the electronic
document, and can be retrieved from a collection of glyphlets. Alternatively, the
identified glyphlet can be retrieved from storage embedded within the electronic
document, including from a collection of glyphlets stored within the electronic document.

In general, in another aspect, the invention features method and apparatus for
constructing a text document. User input is received selecting a character, and a glyphlet
corresponding to the selected character is identified. The glyphlet includes a set of
character attributes defining semantic information of the selected character and a set of
glyph attributes defining appearance information for a glyph representative of the selected
character. A reference to the identified glyphlet is inserted into a text document.

Implementations of the invention can include one or more of the following. The
identified glyphlet can be embedded in the text document. The user input selecting a
character can include user input selecting a glyph shape. The
reference to the identified glyphlet can include one or more in-band values defined in an
encoding standard. The one or more in-band values can define one or more target
attributes uniquely identifying the identified glyphlet in a collection of glyphlets.
Alternatively, the one or more in-band values can define the identified glyphlet.

The reference to the identified glyphlet can include one or more out-of-band
values not defined in an encoding standard. The one or more out-of-band values can be
associated with one or more target attributes uniquely identifying the identified glyphlet
in a collection of glyphlets. Alternatively, if the identified glyphlet is embedded in the
text document, the one or more out-of-band values can be directly associated with the
identified glyphlet.

In general, in another aspect, the invention features method and apparatus for
representing a character in a text document. A reference identifying a glyphlet is inserted
into the text document. The identified glyphlet includes a set of character attributes
defining semantic information of a character and a set of glyph attributes defining
appearance information for a representation of the character.

Implementations of the invention can include one or more of the following. The
identified glyphlet can be embedded in the text document. The reference can include one
or more in-band values defined in an encoding standard that are interpreted to identify
one or more target attributes from which the identified glyphlet can be identified.
Alternatively, the reference can include one or more in-band values that are interpreted to
define the identified glyphlet.

In another implementation, the reference can include one or more out-of-band
values not defined in an encoding standard. The one or more out-of-band values can
identify one or more target attributes, and the reference can identify the glyphlet from a
collection of glyphlets based on the one or more target attributes. Alternatively, if the
identified glyphlet is embedded in the text document, the one or more out-of-band values
can be directly associated with the identified glyphlet.

The set of character attributes can include one or more character attributes
selected from the group consisting of character case, character category, character combining
class, character directionality, character numeric value, mathematical character, character
language, letter character, alphabetic character and ideographic character. The set of
glyph attributes can include one or more glyph attributes selected from the group
consisting of glyph shape, typographic weight, typographic width, slant, number of
strokes, glyph metrics, typeface name, glyph baseline, and glyph kerning.

In general, in another aspect, the invention features a glyphlet. A glyphlet is a
data structure stored on a computer readable medium. The data structure includes
character data representing one or more character attributes defining semantic information
of a character and glyph data representing one or more glyph attributes defining
appearance information for a representation of the character.

Implementations of the invention can include one or more of the following. The
one or more character attributes can include one or more character attributes selected
from the group consisting of character case, character category, character combining
class, character directionality, character numeric value, mathematical character, character
language, letter character, alphabetic character and ideographic character. The one or
more glyph attributes can include one or more glyph attributes selected from the group
consisting of glyph shape, typographic weight, typographic width, slant, number of
strokes, glyph metrics, typeface name, glyph baseline, and glyph kerning.

In general, in another aspect, the invention features an electronic document stored
on a computer readable medium. The electronic document includes electronic data
defining a string that includes one or more references identifying glyphlets. A glyphlet
includes a set of character data representing one or more character attributes defining
semantic information of a character, and a set of glyph data representing one or more
glyph attributes defining appearance information for a representation of the character.

Implementations of the invention can include one or more of the following. The
electronic document can further include electronic data defining a collection of one or
more glyphlets identified by the references in the string. A reference can include one or
more in-band values defined in an encoding standard that are interpreted to identify one
or more target attributes from which a glyphlet can be identified. Alternatively, a
reference can include one or more in-band values that are interpreted to define a glyphlet.

In another implementation, a reference can include one or more out-of-band
values not defined in an encoding standard. The one or more out-of-band values can
identify one or more target attributes, and the reference can identify a glyphlet from a
collection of glyphlets based on the one or more target attributes. The electronic data can
define a collection of one or more glyphlets identified by the references in the string, and
one or more out-of-band values can be directly associated with one or more glyphlets in
the collection.

The set of character attributes can include one or more character attributes
selected from the group consisting of character case, character category, character
combining class, character directionality, character numeric value, mathematical
character, character language, letter character, alphabetic character and ideographic
character. The set of glyph attributes can include one or more glyph attributes selected
from the group consisting of glyph shape, typographic weight, typographic width, slant,
number of strokes, glyph metrics, typeface name, glyph baseline, and glyph kerning.

The invention can be implemented to realize one or more of the following
advantages. Because a glyphlet (including glyph attributes and character attributes) can
be identified based on a reference, a text engine can access glyph and character attribute
information about the glyphlet without reliance on a specific encoding. The text
engine can process the glyphlet as fully as any glyph included in an encoding mapping to a
character set standard. Identifying a glyphlet based on a reference to a set of attributes
adds a level of searchable access to glyphs beyond the traditional one-to-one encoding
mapping. A target glyph can be stored external to a font. A font can be expandable by
having access to additional glyph shapes when used in conjunction with a collection of
one or more glyphlets, which glyphlets are accessible by a text engine processing a
document that includes text in the font. The ability to use a collection of glyphlets in
conjunction with a font can eliminate the need to create, distribute and install a revised
font including additional glyphs, which is cost-effective, convenient and efficient for both
font manufacturers and users.

The details of one or more embodiments of the invention are set forth in the
accompanying drawings and the description below. Other features and advantages of the
invention will be apparent from the description, the drawings, and the claims.

DESCRIPTION OF DRAWINGS
FIG 1 represents a text string including an in-band value reference and multiple
encoded characters.

FIG 2 is a flowchart showing a process for processing text that includes a
reference to a glyphlet.

FIG 3 is a flowchart showing a process for identifying a glyphlet based on a
reference to a set of attributes.

FIG 4 is a flowchart showing a process for constructing a text document including
a glyphlet.

FIG 5 is a flowchart showing a process for identifying a glyph referred by a
dynamic font.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION 
A glyph that does not represent a character in a character set standard, that is, one
that does not have an encoding value in a font and is therefore not associated with a
character in a character set standard, is sometimes referred to as a gaiji (a Japanese term
meaning "foreign glyph"). Because the character set standard does not include a mapping
from an encoding value to a set of character attributes for a gaiji, there are no character
attributes associated with the gaiji. Currently, if a user wants to insert a gaiji into a
document, a separate graphic image of the gaiji's glyph shape can be created and inserted
into a gap in the layout of the text as displayed by a text-processing application. The
display of the glyph shape gives the illusion that the gaiji is part of the text, but the text's
underlying string itself does not include the gaiji or a reference to the gaiji. Further, the
gaiji cannot participate in text-processing activities, for example, select, find/replace,
spell-check and the like, as glyphs representing encoded characters can, because the
text-processing application has no information about the character attributes associated
with the gaiji. If the gaiji is not included in a font, the text-processing application may
also have no information about the glyph attributes associated with the gaiji.

If character and glyph attributes are stored with a glyph, a text engine need not
depend on any specific encoding to determine the character and glyph attributes
associated with the glyph. Accordingly, such a glyph can participate as fully as a glyph
representing an encoded character, even though the glyph is not mapped to a set of glyph
attributes by the font, and is not mapped to a set of character attributes in a character set
standard.

WO 2004/012099    PCT/US2003/024111

A glyphlet is a set of glyph attributes and a set of character attributes. Glyph
attributes define appearance information for a representation of a character, and can
include: glyph shape, typographic weight, typographic width, slant, number of strokes,
glyph metrics, typeface name, glyph baseline and glyph kerning. Character attributes
define semantic information of a character, and can include: character case, character
category, character combining class, character directionality, character numeric value,
mathematical character, character language, letter character, alphabetic character and
ideographic character. The glyph and character attribute lists above are not exhaustive,
and other attributes are possible.

A glyphlet provides a direct relationship between a representation of a character
(i.e., a glyph) and the associated character attributes and glyph attributes. The character
and glyph attributes can be accessed if the identity of the corresponding glyphlet is
known. By contrast, a glyph included in a font is only indirectly related to a set of
character attributes associated with the glyph. The character attributes are defined by the
character set standard, as described above. The relationship between a glyph in a font and
a set of character attributes in a character set standard is therefore dependent on the
encoding mapping of the character set standard and the font. Aside from this relationship,
the glyph and set of character attributes are independent of one another. If the
relationship does not exist, for example, if a font includes a glyph that does not have a
corresponding mapping in the character set standard used to construct the font, a text
engine cannot access character attribute information for the glyph. Without accessible
character attributes, the glyph cannot participate fully as a character within a text
document.

A glyphlet can be implemented as a data structure storing character attributes and
glyph attributes, or surrogates for them, e.g., pointers or indices. In one implementation,
a glyphlet is packaged as an "sfnt" (OpenType) structured font, including at least two
tables: one for the glyph shape and one for the metadata including the character and glyph
attributes. The metadata table is an indexed list of attribute key-value entries. The
glyphlet can be queried for an attribute by searching the metadata list for entries whose
key matches the desired attribute. The glyphlet's metadata can also be pre-fetched and
cached in a database, which can be queried more efficiently than inspecting each
glyphlet's sfnt structure directly.
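
The data structure described above can be sketched in a few lines. This is a
minimal in-memory model, not the sfnt packaging itself: the class name Glyphlet,
the keys charcat, charname and typeface, and the helper build_cache are all
assumptions chosen for illustration, and the glyph shape is represented by an
opaque byte string.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Glyphlet:
    """Hypothetical in-memory glyphlet: a glyph shape plus a metadata list."""

    glyph_shape: bytes  # stand-in for the outline data in the glyph-shape table
    metadata: list[tuple[str, str]] = field(default_factory=list)

    def query(self, key: str) -> list[str]:
        """Return the value of every metadata entry whose key matches."""
        return [value for entry_key, value in self.metadata if entry_key == key]


def build_cache(glyphlets: dict[str, Glyphlet]) -> dict[tuple[str, str], set[str]]:
    """Pre-fetch metadata into an index from (key, value) to glyphlet ids."""
    index: dict[tuple[str, str], set[str]] = {}
    for glyphlet_id, glyphlet in glyphlets.items():
        for entry in glyphlet.metadata:
            index.setdefault(entry, set()).add(glyphlet_id)
    return index


euro = Glyphlet(
    glyph_shape=b"",
    metadata=[
        ("charcat", "Currency Symbol"),
        ("charname", "Euro"),
        ("typeface", "ExampleSans"),
    ],
)
cache = build_cache({"euro": euro})
```

Querying the cache is a single dictionary lookup, whereas Glyphlet.query scans one
glyphlet's list; that difference is the efficiency argument made in the text.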

A glyphlet included in a text document can be represented by a "reference"
included in a text string: a text string can include references in addition to encoding
values representing encoded characters. Processing an electronic document including a
glyphlet can be implemented using a text engine configured to recognize a reference from
which a glyphlet can be identified, and to interpret the reference to identify the glyphlet,
as described further below.

A reference from which a glyphlet can be identified can be an out-of-band value
not defined in an encoding standard. For example, a reference can be an integer value
recognizable by a text engine as referring to a glyphlet that is not included in an encoding
mapping. The out-of-band value can be directly associated with a glyphlet embedded
within the electronic document. Alternatively, the out-of-band value can be associated
with information indicating where a glyphlet can be found external to the text document,
for example, an address to a server from which the glyphlet can be downloaded.

In another alternative, the out-of-band value can be associated with one or more
target attributes, embedded within the electronic document, that uniquely identify a
glyphlet. The target attributes can form the basis of a query used to query a collection of
glyphlets from which the glyphlet can be identified. The collection of glyphlets can be
embedded within the document, or information can be embedded in the document
indicating an external store where the collection of glyphlets can be found.

In any event, the reference is selected to be recognizable by an appropriately
configured text engine as a reference from which a glyphlet can be identified, such that
the text engine must look somewhere other than a character set standard for attribute
information necessary to process the glyphlet. For example, an integer or range of
integers can be set aside for use as references only, such that a text engine processing a
string including encoding values included in an encoding standard and references will
recognize an integer within the range as a reference and identify a glyphlet accordingly.
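
The reserved-range scheme can be sketched as follows. The particular range
0xF0000 to 0xF0FFF and the names REFERENCE_RANGE and classify are assumptions made
for this sketch; the specification only says that some integer or range of
integers is set aside for references.

```python
# Hypothetical range of integers set aside for use as references only.
REFERENCE_RANGE = range(0xF0000, 0xF1000)


def classify(values):
    """Label each integer in a text string as an encoded character or a reference."""
    return [("ref" if value in REFERENCE_RANGE else "char", value) for value in values]


# "Price: " followed by one reference and the digit 5.
text_string = [0x50, 0x72, 0x69, 0x63, 0x65, 0x3A, 0x20, 0xF0001, 0x35]
items = classify(text_string)
reference_count = sum(1 for kind, _ in items if kind == "ref")
```

Values labelled "char" go to the usual encoding tables; values labelled "ref" are
resolved to a glyphlet by whichever association the document uses.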

In another implementation, the reference can be one or more in-band values
defined in an encoding standard. The in-band values can define a glyphlet, that is,
include the glyph attributes and character attributes. Alternatively, the in-band values can
identify where a glyphlet can be found external to the text document, for example, an
address to a server from which the glyphlet can be downloaded.

In another alternative, the in-band values can define one or more target attributes
that can be used to identify the glyphlet. The target attributes defined by the in-band
value can be used to form a query. Using the query, the glyphlet can be identified from a
collection of glyphlets: the glyphlet including one or more attributes satisfying the query.
As discussed above, the collection of glyphlets can be embedded in the text document, or
can be found in an external store identified by information embedded in the text
document.

In-band values can include, for example, an XML string including one or more
tagged elements. The tagged elements can be attribute-value pairs defining target
attributes. FIG 1 shows a representation of a text string 100 including several encoded
characters 105 and an in-band reference 110 from which a glyphlet can be identified. In a
text document, the text string 100 would be formed of encoding values, for example,
ASCII values associated with each of the encoded characters 105 and the encoded
characters included in the in-band value reference 110. For illustrative purposes, the
representative characters are shown in FIG 1.

In this example, the reference 110 includes attribute information from which a
glyphlet can be identified. For example, the character category (charcat) attribute has the
value "Currency Symbol". As described above, the attributes defined by the reference
110 can be used to form a query. A collection of glyphlets can be queried and a glyphlet
having one or more attributes satisfying the query can be identified. A glyph shape
representing the character having the attributes defined by the reference 110 can be
rendered using glyph attribute information accessible from the glyphlet. In this
example, the glyph shape for the Euro symbol € 120 can be rendered and displayed by a
display device, such as a monitor or printer, along with glyph shapes representing the
encoded characters 105 included in the text string 100.
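
A sketch of how an in-band XML reference might be turned into target attributes
and matched against a collection follows. The element layout is an assumption:
only the charcat attribute and its value "Currency Symbol" come from the text,
while the glyphlet wrapper element, the charname element, and the dictionary-based
collection are hypothetical. The sketch also requires every target attribute to
match, which is one possible reading of "satisfying the query".

```python
import xml.etree.ElementTree as ET


def parse_reference(xml_text):
    """Read target attributes from a tagged in-band reference (layout assumed)."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def find_glyphlets(collection, target):
    """Return the glyphlets whose attributes include every target attribute."""
    return [g for g in collection if all(g.get(k) == v for k, v in target.items())]


reference = (
    "<glyphlet><charcat>Currency Symbol</charcat>"
    "<charname>Euro</charname></glyphlet>"
)
collection = [
    {"charcat": "Currency Symbol", "charname": "Euro", "shape": "euro-outline"},
    {"charcat": "Currency Symbol", "charname": "Dollar", "shape": "dollar-outline"},
]

target = parse_reference(reference)
matches = find_glyphlets(collection, target)
```

Because the reference is ordinary encoded text, a text engine that does not
understand glyphlets still sees well-formed characters rather than unknown values.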

One or more glyphlets can be included in a cache included within a text
document, and, as discussed above, a reference can be a pointer or offset to a glyphlet
within the cache. Alternatively, a reference can be a pointer to a location in memory
storing the glyphlet.

FIG 2 shows a process for retrieving attribute information from a glyphlet, for
example, using a text engine processing a text document. As discussed above, strings
within the text document include both encoding values associated with encoded
characters, and references associated with glyphlets. In the first step, a reference is
encountered in a string and recognized as a reference from which a glyphlet can be
identified (Step 205). The recognition occurs because, for example, the reference is an
out-of-band value integer within a range of integers reserved for references, or because
the reference is one or more in-band values defining target attributes. If the
text engine requires character or glyph attribute information for the glyphlet, then the text
engine queries the glyphlet (Step 210). Attribute information is retrieved from the
glyphlet in response to the query (Step 215). For example, a text engine might be tasked
to build a concordance for a text document, which requires a count of words within the
document. In order to identify a word formed by a string, the text engine may need to
retrieve character attribute information for a glyphlet represented by a reference included
in the string. For example, the text engine may need to determine the pronunciation of
the glyphlet, in order to identify the word. The string including the glyphlet can then be
processed based on the retrieved attribute information (Step 220).
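
The four steps of FIG 2 can be sketched with a word-counting task. Everything
named here is an assumption for illustration: the reserved range, the
EMBEDDED_GLYPHLETS table, and the "letter" attribute key. The sketch counts words
rather than building a full concordance, and it asks the glyphlet whether it is a
letter instead of asking for its pronunciation.

```python
REFERENCE_RANGE = range(0xF0000, 0xF1000)  # hypothetical range reserved for references

# Hypothetical glyphlets embedded in the document, keyed by reference value.
EMBEDDED_GLYPHLETS = {0xF0001: {"letter": True, "charname": "example gaiji"}}


def is_letter(value):
    """Decide whether a string item is a letter, consulting a glyphlet if needed."""
    if value in REFERENCE_RANGE:              # Step 205: recognize the reference
        glyphlet = EMBEDDED_GLYPHLETS[value]  # Step 210: query the glyphlet
        return bool(glyphlet["letter"])       # Step 215: retrieve the attribute
    return chr(value).isalpha()               # encoded character: static tables


def count_words(values):
    """Step 220: process the string using the retrieved attribute information."""
    words, in_word = 0, False
    for value in values:
        letter = is_letter(value)
        if letter and not in_word:
            words += 1
        in_word = letter
    return words


# "a", a reference, "b", a space, "c": the reference joins "a" and "b" into one word.
string = [ord("a"), 0xF0001, ord("b"), ord(" "), ord("c")]
word_count = count_words(string)
```

Without the glyphlet's attribute, the reference would be treated as a non-letter
and the first word would be split in two.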

As discussed above, a reference can be used to identify one or more target
attributes, from which a target glyphlet can be identified. FIG 3 shows a process for
identifying a glyphlet based on a reference to a set of attributes. For illustrative purposes,
consider a text engine processing a text document. The text engine receives a reference,
which can be included in a string among encoding values associated with encoded
characters, and recognizes the reference as referring to a set of attributes (Step 305). As
discussed above, the reference can be recognizable as referring to a set of attributes
because, for example, the reference is an integer within a range of integers reserved
for references. Alternatively, the reference can specify attributes, for example, the
reference can be a tagged XML string (included in the text string) defining the
attributes. To identify a glyphlet associated with the reference, the text engine generates a
query based on the set of attributes identified by the reference (Step 310).

A collection of one or more glyphlets is then queried, to identify a glyphlet
including one or more attributes satisfying the query (Step 315). It is possible that two or more
glyphlets satisfy a query, in which case the user can be presented a visual representation of
the glyphlets and prompted to make a choice. Alternatively, it is possible that no glyphlet
satisfies a query. In this case, the text engine can be configured to alert the user that no
corresponding glyphlet is available, e.g., by displaying an error message or, if appropriate,
inserting a default glyph at the relevant location in the text. The collection of glyphlets
can be a cache of glyphlets included in the text document, or can be stored in memory,
separate from the text document. A glyphlet including one or more attributes satisfying
the query is identified (Step 320). The text engine can then access information from the
glyphlet to render the glyph (i.e., glyph attributes) or otherwise process the glyphlet (i.e.,
character attributes).
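
The three outcomes of the query in FIG 3 (exactly one match, several matches, no
match) can be sketched as below. The names resolve, choose and DEFAULT_GLYPH are
hypothetical, the user prompt is modelled as a callback, and falling back to the
first match when no callback is supplied is an assumption of this sketch rather
than behaviour stated in the text.

```python
# Hypothetical default glyph inserted when no glyphlet satisfies the query.
DEFAULT_GLYPH = {"charname": "default", "shape": "missing-glyph-box"}


def resolve(collection, target, choose=None):
    """Steps 310 to 320: query the collection and settle on a single glyphlet."""
    matches = [g for g in collection if all(g.get(k) == v for k, v in target.items())]
    if not matches:
        return DEFAULT_GLYPH      # no match: fall back (or alert the user)
    if len(matches) > 1 and choose is not None:
        return choose(matches)    # several matches: let the user pick
    return matches[0]             # one match, or first match by assumption


collection = [
    {"charcat": "Currency Symbol", "charname": "Euro", "shape": "euro-outline"},
    {"charcat": "Currency Symbol", "charname": "Dollar", "shape": "dollar-outline"},
]

unique = resolve(collection, {"charname": "Euro"})
picked = resolve(collection, {"charcat": "Currency Symbol"}, choose=lambda ms: ms[-1])
missing = resolve(collection, {"charname": "Yen"})
```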

The independence of a glyphlet from a character set standard or a font allows new
characters to be introduced into a text processing application. By way
of illustrative example, consider the introduction of a common currency in Europe, which
required a new typographic symbol to represent the new common European currency, the
"Euro". Most existing character set standards and fonts did not include the symbol €
representing the Euro. Character set standards were eventually changed and new fonts,
conforming with the revised character set standards, were developed and distributed.

Another approach to the introduction of the new Euro symbol would have been to
distribute a glyphlet representing the € symbol, including glyph attributes and character
attributes. The glyphlet can be stored by an operating system. In one implementation, a
text engine can build a menu or palette indicating characters available for use in a text
document from a collection of one or more such glyphlets stored by the operating system.
The text engine can query the glyphlets and generate corresponding menu items, which a
user can use to select the indicated character. The character may be indicated by a
graphic representation, i.e., a glyph shape rendered based on glyph attribute information
obtained from the corresponding glyphlet, or by one or more character attributes unique
to the glyphlet, such as the character name. By selecting a menu item, a user can insert a
reference to a glyphlet corresponding to the character indicated by the menu item into a
text document.

In another implementation, a menu item can be associated with one or more
character attributes, rather than associated with a glyphlet. If the menu item is selected,
then based on the one or more associated character attributes, the text engine can form
a query and query a collection of one or more glyphlets, which can be stored by the
operating system. If a match is found, that is, a glyphlet including one or more character
attributes satisfying the query, then the glyphlet can be used to render a glyph shape or
otherwise process the glyphlet.

To illustrate the above, consider the Euro example. FIG 4 shows a process for
constructing a text document including a reference to a glyphlet. In this example, the
word processing application has a menu of characters, which includes a representation of
the Euro symbol, such as a graphic image of the symbol, the name of the symbol, i.e.,
Euro, or some other attribute clearly identifying the Euro (Step 405). User input is
received selecting the Euro, for example, by highlighting and clicking on the menu item
representing the Euro (Step 410).

A glyphlet corresponding to the selected character is then identified (Step 415).
For example, the menu item can be associated with a pointer that can be followed to a
memory location where the glyphlet is stored. Alternatively, the one or more target
attributes can be stored at the memory location, from which a query can be formed. The
query can be used to query a collection of glyphlets to identify a target glyphlet having
one or more attributes satisfying the query.
Once a glyphlet is identified, a reference to the identified glyphlet is inserted into
the text document (Step 420). As discussed above, the reference can be an out-of-band
value or one or more in-band values. A glyph shape rendered from the glyph attributes
included in the target glyphlet can be displayed in a display of the text document, for
example, on a monitor or by a printer. The reference can be subsequently used by a text
engine processing the text document to identify the target glyphlet.
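
The insertion step (Step 420) and the later use of the reference can be sketched as
follows. The sketch assumes an out-of-band style of reference, modelled here as a separate
object stored beside the text runs; the `GlyphletRef` class, the list-based document model,
and the function names are illustrative assumptions, not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlyphletRef:
    """Hypothetical out-of-band reference: kept beside the text, not as an encoded character."""
    glyphlet_id: str


def insert_reference(document, position, glyphlet_id):
    """Insert a reference to the identified glyphlet at the given position (Step 420)."""
    document.insert(position, GlyphletRef(glyphlet_id))


def render(document, glyphlets):
    """Produce display text, resolving each reference to the glyph shape of its glyphlet."""
    parts = []
    for item in document:
        if isinstance(item, GlyphletRef):
            parts.append(glyphlets[item.glyphlet_id]["shape"])
        else:
            parts.append(item)
    return "".join(parts)


glyphlets = {"euro": {"name": "Euro", "shape": "€"}}
document = ["Price: ", "100"]          # text runs of a toy document
insert_reference(document, 1, "euro")  # user selected the Euro menu item
displayed = render(document, glyphlets)
```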

A recipient of the text document processing the text document using a different
text engine and a different computer may find a reference in the text document
meaningless, particularly if the recipient's text engine does not have access to the
glyphlet. To avoid this situation, the glyphlet can be embedded within the text document
and thus accessible to a recipient of the text document. In another implementation, the
reference can provide a text engine processing the document information about where to
obtain a corresponding glyphlet, for example, an address where the glyphlet can be
retrieved from a server.
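
The two ways of making a glyphlet available to a recipient can be sketched together. In
this illustration the document is a dictionary with an assumed `"embedded_glyphlets"`
entry, the reference carries an assumed optional `"address"` field, and `fetch` is a
caller-supplied function standing in for whatever retrieval mechanism is used; the address
shown is made up, and no real network call is performed.

```python
def resolve_glyphlet(reference, document, fetch):
    """Return the glyphlet for a reference, or None if it cannot be obtained.

    Glyphlets embedded in the document are tried first. If none is embedded and
    the reference carries an address, the caller-supplied fetch function is
    asked to retrieve the glyphlet from that address.
    """
    embedded = document.get("embedded_glyphlets", {})
    if reference["id"] in embedded:
        return embedded[reference["id"]]
    address = reference.get("address")
    if address is not None:
        return fetch(address)
    return None


euro = {"name": "Euro", "shape": "€"}

# Stand-in for a server: a dictionary keyed by a made-up address.
fake_server = {"glyphlets.example/euro": euro}

with_embedding = {"embedded_glyphlets": {"euro": euro}}
without_embedding = {}

from_document = resolve_glyphlet({"id": "euro"}, with_embedding, fake_server.get)
from_server = resolve_glyphlet(
    {"id": "euro", "address": "glyphlets.example/euro"}, without_embedding, fake_server.get)
unavailable = resolve_glyphlet({"id": "euro"}, without_embedding, fake_server.get)
```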

As discussed above, if a character set standard is revised, then a font constructed
under a previous version of the character set standard may be outdated and considered
incomplete and non-conformant. To bring a font into conformance with the revised
character set standard requires revisions to the font, and distribution and installation of a
new font to users of the existing font. By using one or more glyphlets, the glyph shapes
accessible by a font can be expanded without requiring a revised font to be distributed
and installed by font users.

One approach to expanding the glyph shapes accessible by a font is to provide
font users with a collection of one or more glyphlets corresponding to changes made to
the underlying character set standard. For example, when a new character is added to the
character set standard, rather than distribute a revised font that includes a glyph image for
the new character and shares an encoding value mapping to the new character's attributes
in the character set standard, a new glyphlet for the new character is distributed to font
users that can be used in conjunction with the existing font. The new glyphlet includes
glyph attributes for rendering a glyph shape representing the new character, with any
stylistic features necessary. The new glyphlet also includes character attributes
corresponding to the new character. The character attributes mirror those included in the
revised character set standard corresponding to the new character, but are accessible by
querying the new glyphlet rather than by an encoding value mapping to the character set
standard. Thus, the font in conjunction with the new glyphlet performs just as if a revised
font including a glyph image for the new character and a corresponding mapping to the new
character set standard were being used. In this manner, additional glyph shapes can
easily, quickly and inexpensively be made accessible without necessitating issuance of a
revised form of the font.

Another approach to expanding the glyph shapes accessible by a font using a
glyphlet is to include a mapping to one or more references in a font. The one or more
references can be used to identify one or more glyphlets. As discussed above, the
reference can be uniquely associated with a glyphlet, or can identify one or more target
attributes that can be used to identify a target glyphlet.

By way of illustrative example, consider the introduction of the new typographic
symbol to represent the "Euro", discussed above. Before the symbol € became
representative of the Euro, it was known that a new common currency would be
established, and that a new symbol would likely be necessary to represent the currency.
Accordingly, there was a time period during which font manufacturers released fonts
knowing that they would soon be rendered outdated by the introduction of the new
currency symbol. A character set standard from which a font was constructed may have
been revised to include an encoding value and character attributes for the Euro, although
the representation of the Euro symbol had not yet been determined. One solution would
have been to include a mapping in the font from the Euro encoding value to a reference to
a set of attributes, which attributes could later be used to form a query from which a Euro
glyphlet could be identified. For example, the set of attributes could include the
"currency symbol" attribute. Later, once the symbol had been determined, the font
manufacturer could provide a new glyphlet, including an appropriate set of glyph
attributes and character attributes, for example, a name attribute having the value "Euro"
and a "currency symbol" attribute.

FIG. 5 shows a process for expanding the glyph shapes accessible by a "dynamic"
font. Using the above illustrative example, the first step includes receiving a dynamic
font; that is, a font that includes a mapping to a reference to a glyphlet having a
"currency symbol" attribute. The font does not include an encoding value mapped to a
glyph within the font for the € symbol, because the font was released before it was known
that the € symbol would represent the Euro.

Once it is known the € symbol represents the Euro, the font manufacturer can provide a
glyphlet representing the Euro character, including a set of glyph attributes and character
attributes. The Euro glyphlet is provided to users of the font and installed (e.g., stored in
memory), rather than requiring installation of a new font altogether. A user can then
create a text document including the € symbol, for example, by choosing the symbol from
a drop-down menu on a text processing application. The application inserts in the
document the reference to a set of one or more attributes, such as the "currency symbol"
attribute, from which the Euro glyphlet can be identified.

A text engine processing a text document including the € symbol encounters the
reference and recognizes the reference as referring to a set of attributes from which a
glyphlet can be identified. The text engine then generates a query based on the set of
attributes (Step 515). A cache or collection of glyphlets is queried (Step 520). The text
engine identifies the Euro glyphlet from the collection of glyphlets as having attributes
satisfying (i.e., matching to some degree) the query (Step 525). The text engine can then
render the glyph shape, in this example the € symbol, or otherwise process the glyphlet as
if a glyph had been included in the font's encoding and mapped to the underlying
character set standard.
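
Steps 515 through 525 can be sketched for a toy dynamic font. The dictionary layout of the
font, the `lookup` function, and the use of a boolean `"currency symbol"` attribute are
assumptions made for illustration; the encoding values shown are the Unicode code points
for the dollar sign and the Euro sign, used here only as convenient keys.

```python
def lookup(font, encoding_value, glyphlet_store):
    """Return a glyph shape for an encoding value in a hypothetical dynamic font.

    An entry either holds a glyph directly or holds a reference to a set of
    attributes from which a glyphlet can be identified.
    """
    entry = font[encoding_value]
    if "glyph" in entry:                      # ordinary glyph carried by the font
        return entry["glyph"]
    query = dict(entry["reference"])          # Step 515: generate a query
    for glyphlet in glyphlet_store:           # Step 520: query the collection
        attributes = glyphlet["character_attributes"]
        if all(attributes.get(k) == v for k, v in query.items()):
            return glyphlet["glyph_attributes"]["shape"]  # Step 525: glyphlet identified
    return None                               # no matching glyphlet installed yet


font = {
    0x24: {"glyph": "$"},
    0x20AC: {"reference": {"currency symbol": True}},
}

store = []
before_install = lookup(font, 0x20AC, store)

# The Euro glyphlet is distributed later and installed beside the unchanged font.
store.append({
    "character_attributes": {"name": "Euro", "currency symbol": True},
    "glyph_attributes": {"shape": "€"},
})
after_install = lookup(font, 0x20AC, store)
```

The font itself is never modified: only the glyphlet store changes between the two lookups.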

In one implementation, a font can be formed entirely of a mapping to references
from which glyphlets can be identified, as described above in the context of the Euro
symbol example. By using the font in conjunction with a collection of one or more
glyphlets, glyphlets referenced by the font can be accessed to render glyph shapes or
otherwise process the glyphlets based on the glyphlet's character and glyph attributes.

The first time a text engine encounters a reference and identifies a glyphlet based
on the reference, the text engine can store an identifier for the glyphlet associated with the
reference. The next time the reference is encountered, the text engine can identify the
glyphlet based on the stored identifier, without having to perform another query. For
example, if the glyphlet includes a name, the text engine can associate the name of the
glyphlet with the reference. The next time the reference is encountered, the text engine
can use the glyphlet's name to access the glyphlet without requiring a query.
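
A minimal sketch of this caching behaviour follows. The `TextEngine` class, its fields,
and the representation of a reference as a tuple of attribute pairs are hypothetical; the
`queries_run` counter exists only to make the saved work visible.

```python
class TextEngine:
    """Toy text engine that remembers which glyphlet a reference resolved to."""

    def __init__(self, store):
        self.store = store      # glyphlet name -> glyphlet attributes
        self.resolved = {}      # reference -> stored identifier (the glyphlet name)
        self.queries_run = 0    # counts full queries, for illustration only

    def _query(self, attributes):
        """Search the store for a glyphlet whose attributes satisfy the query."""
        self.queries_run += 1
        for name, glyphlet in self.store.items():
            if all(glyphlet.get(k) == v for k, v in attributes.items()):
                return name
        return None

    def identify(self, reference):
        """Resolve a reference given as a hashable tuple of (attribute, value) pairs."""
        name = self.resolved.get(reference)
        if name is None:
            name = self._query(dict(reference))
            if name is not None:
                self.resolved[reference] = name  # remember the identifier
        return self.store.get(name)


store = {"Euro": {"name": "Euro", "currency symbol": True, "shape": "€"}}
engine = TextEngine(store)
reference = (("currency symbol", True),)

first = engine.identify(reference)   # runs a query and stores the name
second = engine.identify(reference)  # uses the stored name, no query
```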

To summarize the above in the context of a font-rendering system, consider a
font-rendering system tasked to render a desired character. If the font includes a mapping
to a glyph definition, the system displays a glyph image. If the font includes a
mapping to a reference from which a glyphlet can be identified, the system identifies the
glyphlet based on the reference and displays a glyph image. The reference can be
associated with either a glyphlet or a set of attributes from which a glyphlet can be
identified, or can include the glyphlet, as discussed above. If the reference identifies a set
of attributes, the system can form a query based on the attributes, and search a store of
glyphlets to find a glyphlet having attributes best matching the query. The system can
task an external mechanism to perform the search. The external mechanism returns to the
system a glyphlet that includes attributes satisfying the query. The system can cache the
glyphlet in memory and display the glyph image, or otherwise process the glyphlet based
on the glyph and/or character attributes.
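
The decision flow summarized above can be sketched as a single dispatch function. The
entry layout, the function names, and in particular the scoring rule used for "best
matching" (a count of attributes with equal values) are assumptions chosen for
illustration; the specification does not prescribe how the degree of match is measured.

```python
def best_match(store, wanted):
    """Return the glyphlet sharing the most attribute values with the query, or None."""
    def score(glyphlet):
        return sum(1 for k, v in wanted.items()
                   if glyphlet["attributes"].get(k) == v)

    best = max(store, key=score, default=None)
    if best is None or score(best) == 0:
        return None
    return best


def glyph_image(font_entry, store):
    """Pick what to display for one font entry in a hypothetical rendering system."""
    if "glyph" in font_entry:              # mapping to a glyph definition
        return font_entry["glyph"]
    reference = font_entry["reference"]
    if "glyphlet" in reference:            # reference tied directly to a glyphlet
        glyphlet = reference["glyphlet"]
    else:                                  # reference identifies a set of attributes
        glyphlet = best_match(store, reference["attributes"])
    return glyphlet["shape"] if glyphlet else None


store = [
    {"attributes": {"name": "Dollar", "category": "currency symbol"}, "shape": "$"},
    {"attributes": {"name": "Euro", "category": "currency symbol"}, "shape": "€"},
]

direct = glyph_image({"glyph": "A"}, store)
by_attributes = glyph_image(
    {"reference": {"attributes": {"name": "Euro", "category": "currency symbol"}}}, store)
no_match = glyph_image({"reference": {"attributes": {"name": "Yen"}}}, store)
```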

In one implementation, a typeface can be formed entirely of glyphlets. A
glyphlet-based typeface does not include any encoding values mapped to glyphs in a font
and characters in a character set standard. Rather, the typeface is formed from a
collection of one or more glyphlets that can be identified by a reference included in a text
document, as described above. The reference can be directly associated with a glyphlet,
or can be associated with one or more target attributes from which a glyphlet can be
identified. Such a font can be useful for glyphs that are likely to change often.

The invention can be implemented in digital electronic circuitry, or in computer
hardware, firmware, software, or in combinations of them. Apparatus of the invention
can be implemented in a computer program product tangibly embodied in a machine-
readable storage device for execution by a programmable processor; and method steps of
the invention can be performed by a programmable processor executing a program of
instructions to perform functions of the invention by operating on input data and
generating output. The invention can be implemented advantageously in one or more
computer programs that are executable on a programmable system including at least one
programmable processor coupled to receive data and instructions from, and to transmit
data and instructions to, a data storage system, at least one input device, and at least one
output device. Each computer program can be implemented in a high-level procedural or
object-oriented programming language, or in assembly or machine language if desired;
and in any case, the language can be a compiled or interpreted language. Suitable
processors include, by way of example, both general and special purpose
microprocessors. Generally, a processor will receive instructions and data from a
read-only memory and/or a random access memory. Generally, a computer will include one
or more mass storage devices for storing data files; such devices include magnetic disks,
such as internal hard disks and removable disks; magneto-optical disks; and optical
disks. Storage devices suitable for tangibly embodying computer program instructions
and data include all forms of non-volatile memory, including by way of example
semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices;
magnetic disks such as internal hard disks and removable disks; magneto-optical disks;
and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in,
ASICs (application-specific integrated circuits).

To provide for interaction with a user, the invention can be implemented on a
computer system having a display device such as a monitor or LCD screen for displaying
information to the user and a keyboard and a pointing device such as a mouse or a
trackball by which the user can provide input to the computer system. The computer
system can be programmed to provide a graphical user interface through which computer
programs interact with users.

The invention has been described in terms of particular embodiments. Other
embodiments are within the scope of the following claims. For example, the steps of the
invention can be performed in a different order and still achieve desirable results.

What is claimed is:

1. A computer-implemented method for constructing a text document, the method
comprising:

receiving user input selecting a character (410);

identifying a glyphlet corresponding to the selected character (415), the glyphlet
including a set of character attributes defining semantic information of the selected
character and a set of glyph attributes defining appearance information for a glyph
representative of the selected character; and

inserting a reference to the identified glyphlet into a text document (420).

2. The method of claim 1, wherein:

user input selecting a character (410) includes user input selecting a glyph shape
representing the character.

3. The method of claim 1, wherein:

the reference to the identified glyphlet (420) includes one or more in-band
values (110) defined in an encoding standard.

4. The method of claim 3, wherein:

the one or more in-band values (110) define one or more target attributes uniquely
identifying the identified glyphlet in a collection of glyphlets.

5. The method of claim 3, wherein:

the one or more in-band values (110) define the identified glyphlet.

6. The method of claim 1, wherein:

the reference to the identified glyphlet (420) includes one or more out-of-band
values not defined in an encoding standard.

7. The method of claim 6, wherein:

the one or more out-of-band values are associated with one or more target attributes
uniquely identifying the identified glyphlet in a collection of glyphlets.

8. The method of claim 6, further comprising:

embedding the identified glyphlet in the text document;

wherein the one or more out-of-band values are directly associated with the
identified glyphlet.

9. The method of claim 1, further comprising:

embedding the identified glyphlet in the text document.

10. The method of claim 1, wherein:

the set of character attributes includes one or more character attributes selected from
the group consisting of character case, character category, character combining class,
character directionality, character numeric value, mathematical character, character
language, letter character, alphabetic character and ideographic character.

11. The method of claim 1, wherein:

the set of glyph attributes includes one or more glyph attributes selected from the
group consisting of glyph shape, typographic weight, typographic width, slant, number of
strokes, glyph metrics, typeface name, glyph baseline, and glyph kerning.

12. A glyphlet, comprising:

a data structure stored on a computer readable medium, the data structure including
character data representing one or more character attributes defining semantic information
of a character and glyph data representing one or more glyph attributes defining
appearance information for a representation of the character.

13. The glyphlet of claim 12, wherein:

the one or more character attributes include one or more character attributes selected
from the group consisting of character case, character category, character combining
class, character directionality, character numeric value, mathematical character, character
language, letter character, alphabetic character and ideographic character.

14. The glyphlet of claim 12, wherein:

the one or more glyph attributes includes one or more glyph attributes selected from
the group consisting of glyph shape, typographic weight, typographic width, slant,
number of strokes, glyph metrics, typeface name, glyph baseline, and glyph kerning.

15. A computer program product, tangibly stored on a machine-readable medium,
comprising instructions operable to cause a programmable processor to:

obtain an electronic document including a string (100) that includes one or more
references;

parse the string (100) to identify a reference;

based on the identified reference, identify a glyphlet including a set of character
attributes defining semantic information of a character and a set of glyph attributes
defining appearance information for a representation of the character; and

use one or more character attributes or glyph attributes for the identified glyphlet to
process text in the electronic document.

16. The computer program product of claim 15, wherein:

the string includes a plurality of references comprising one or more in-band
values (110) defined in an encoding standard.

17. The computer program product of claim 16, wherein:

instructions operable to parse the string to identify a reference include instructions
operable to interpret a plurality of in-band values (110) to define the identified reference.

18. The computer program product of claim 17, wherein instructions operable to:

interpret a plurality of in-band values (110) to define the identified reference include
instructions operable to identify one or more target attributes; and

identify a glyphlet based on the identified reference include instructions operable to
identify a glyphlet in a collection of glyphlets based on the identified target attributes.

19. The computer program product of claim 17, wherein:

the plurality of in-band values (110) define the identified glyphlet.

20. The computer program product of claim 17, wherein:

the plurality of in-band values identify a location external to the electronic
document from where the identified glyphlet can be retrieved.

21. The computer program product of claim 15, wherein:

the string includes one or more references comprising one or more out-of-band
values not defined in an encoding standard.

22. The computer program product of claim 21, wherein:

the identified reference includes one or more of the out-of-band values.

23. The computer program product of claim 22, wherein:

the identified glyphlet is embedded in the electronic document; and

the one or more out-of-band values are directly associated with the identified
glyphlet.

24. The computer program product of claim 22, wherein instructions operable to:

identify a glyphlet based on the identified reference include instructions operable to
identify one or more target attributes based on the identified reference and identify a
glyphlet in a collection of glyphlets based on the identified target attributes.

25. The computer program product of claims 18 and 24, wherein:

the collection of glyphlets is embedded within the electronic document.

26. The computer program product of claims 18 and 24, wherein:

the collection of glyphlets is stored external to the electronic document.

27. The computer program product of claim

the set of character attributes includes one or more character attributes selected from
the group consisting of character case, character category, character combining class,
character directionality, character numeric value, mathematical character, character
language, letter character, alphabetic character and ideographic character.

28. The computer program product of claim 15, wherein:

the set of glyph attributes includes one or more glyph attributes selected from the
group consisting of glyph shape, typographic weight, typographic width, slant, number of
strokes, glyph metrics, typeface name, glyph baseline and glyph kerning.

29. The computer program product of claim 15, further comprising instructions
operable to:

retrieve the identified glyphlet from a memory external to the electronic document.

30. The computer program product of claim 29, wherein:

the identified glyphlet is retrieved from a collection of glyphlets.

31. The computer program product of claim 15, further comprising instructions
operable to:

retrieve the identified glyphlet from storage embedded within the electronic
document.

32. The computer program product of claim 31, wherein:

the identified glyphlet is retrieved from a collection of glyphlets.